LegalWebAgent: Empowering Access to Justice via LLM-Based Web Agents
Tan, Jinzhe, Benyekhlef, Karim
Access to justice remains a global challenge: many citizens still find it difficult to seek help from the justice system when facing legal issues. Although the internet provides abundant legal information and services, navigating complex websites, understanding legal terminology, and filling out procedural forms continue to pose barriers to access. This paper introduces LegalWebAgent, a framework that employs a web agent powered by multimodal large language models to bridge this gap for ordinary citizens. The framework combines the natural language understanding capabilities of large language models with multimodal perception, enabling a complete process from user query to concrete action. It operates in three stages: the Ask Module interprets user needs through natural language processing; the Browse Module autonomously navigates webpages, interacts with page elements (including forms and calendars), and extracts information from HTML structures and webpage screenshots; and the Act Module synthesizes information for users or performs direct actions such as form completion and schedule booking. To evaluate its effectiveness, we designed a benchmark of 15 real-world tasks simulating typical legal service processes relevant to Québec civil law users, from problem identification to procedural operations. Evaluation results show LegalWebAgent achieved a peak success rate of 86.7%, with an average of 84.4% across all tested models, demonstrating high autonomy in complex real-world scenarios.
- North America > Canada > Quebec > Montreal (0.05)
- North America > Canada > Alberta > Census Division No. 13 > Westlock County (0.04)
- North America > Canada > Alberta > Census Division No. 11 > Sturgeon County (0.04)
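The Ask/Browse/Act pipeline described in the abstract can be sketched as a simple staged loop. This is a minimal illustration under assumed names (`AgentState` and stub stage functions are hypothetical), not the paper's actual implementation; a real system would back each stage with a multimodal LLM and a browser driver.

```python
from dataclasses import dataclass, field

@dataclass
class AgentState:
    """Carries information between the three stages (hypothetical structure)."""
    query: str
    intent: str = ""
    observations: list = field(default_factory=list)
    result: str = ""

def ask(state: AgentState) -> AgentState:
    # Ask Module: turn the user's free-text query into a structured intent.
    # A real system would call a multimodal LLM here; this is a stub.
    state.intent = f"resolve: {state.query.lower()}"
    return state

def browse(state: AgentState) -> AgentState:
    # Browse Module: navigate pages and collect evidence from HTML/screenshots.
    state.observations.append(f"page content relevant to '{state.intent}'")
    return state

def act(state: AgentState) -> AgentState:
    # Act Module: synthesize observations into an answer (or perform an action).
    state.result = "; ".join(state.observations)
    return state

def run_pipeline(query: str) -> str:
    state = AgentState(query=query)
    for stage in (ask, browse, act):
        state = stage(state)
    return state.result
```

The point of the staged structure is that each module has a single, inspectable input/output contract, which is what lets the Browse stage be swapped between HTML parsing and screenshot-based perception.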
The Effect of Enforcing Fairness on Reshaping Explanations in Machine Learning Models
Anderson, Joshua Wolff, Visweswaran, Shyam
Trustworthy machine learning in healthcare requires strong predictive performance, fairness, and explainability. While it is known that improving fairness can affect predictive performance, little is known about how fairness improvements influence explainability, an essential ingredient for clinical trust. Clinicians may hesitate to rely on a model whose explanations shift after fairness constraints are applied. In this study, we examine how enhancing fairness through bias mitigation techniques reshapes Shapley-based feature rankings. We quantify changes in feature importance rankings after applying fairness constraints across three datasets: pediatric urinary tract infection risk, direct anticoagulant bleeding risk, and recidivism risk. We also evaluate the stability of Shapley-based rankings across multiple model classes. We find that increasing model fairness across racial subgroups can significantly alter feature importance rankings, sometimes in different ways across groups. These results highlight the need to consider accuracy, fairness, and explainability jointly in model assessment rather than in isolation.
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Oceania > Guam (0.04)
- North America > United States > Alaska (0.04)
- (2 more...)
- Research Report > Experimental Study (0.69)
- Research Report > New Finding (0.68)
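The ranking-shift analysis the abstract describes can be illustrated with a toy computation: derive feature ranks from absolute Shapley importances before and after mitigation, then compare the two rankings with Kendall's tau. The feature names and importance values below are invented for illustration; the paper's actual features and metrics may differ.

```python
from itertools import combinations

def rank_map(importances: dict) -> dict:
    """Map feature -> rank (0 = most important) from |Shapley| importances."""
    ordered = sorted(importances, key=lambda f: -abs(importances[f]))
    return {f: r for r, f in enumerate(ordered)}

def kendall_tau(rank_a: dict, rank_b: dict) -> float:
    """Kendall rank correlation between two rankings of the same features."""
    feats = list(rank_a)
    concordant = discordant = 0
    for f, g in combinations(feats, 2):
        s = (rank_a[f] - rank_a[g]) * (rank_b[f] - rank_b[g])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(feats) * (len(feats) - 1) / 2
    return (concordant - discordant) / n_pairs

# Hypothetical mean |SHAP| values before and after bias mitigation.
before = {"age": 0.40, "creatinine": 0.25, "race": 0.20, "bmi": 0.05}
after  = {"age": 0.35, "creatinine": 0.30, "bmi": 0.15, "race": 0.02}
tau = kendall_tau(rank_map(before), rank_map(after))
```

A tau of 1.0 means the ranking is unchanged; values well below 1.0 flag exactly the explanation drift that may erode clinical trust.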
RAG System for Supporting Japanese Litigation Procedures: Faithful Response Generation Complying with Legal Norms
Ishihara, Yuya, Keyaki, Atsushi, Yamada, Hiroaki, Ohara, Ryutaro, Sumida, Mihoko
This study discusses the essential components that a Retrieval-Augmented Generation (RAG)-based LLM system should possess in order to support Japanese medical litigation procedures complying with legal norms. In litigation, expert commissioners, such as physicians, architects, accountants, and engineers, provide specialized knowledge to help judges clarify points of dispute. When considering the substitution of these expert roles with a RAG-based LLM system, the constraint of strict adherence to legal norms is imposed. Specifically, three requirements arise: (1) the retrieval module must retrieve appropriate external knowledge relevant to the disputed issues in accordance with the principle prohibiting the use of private knowledge, (2) the responses generated must originate from the context provided by the RAG and remain faithful to that context, and (3) the retrieval module must reference external knowledge with appropriate timestamps corresponding to the issues at hand. This paper discusses the design of a RAG-based LLM system that satisfies these requirements.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.15)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
- North America > United States (0.04)
- (4 more...)
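Requirements (2) and (3) above lend themselves to a small sketch: a retriever restricted to sources dated on or before the events in dispute, and a crude lexical faithfulness check on the generated answer. This is an assumed toy design, not the system proposed in the paper; `Doc`, `retrieve`, and `is_faithful` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Doc:
    text: str
    published: date

def retrieve(corpus: list, query_terms: set, as_of: date, k: int = 3) -> list:
    """Return the k docs most overlapping the query, restricted to those
    published on or before the date of the disputed events (requirement 3)."""
    eligible = [d for d in corpus if d.published <= as_of]
    scored = sorted(eligible,
                    key=lambda d: -len(query_terms & set(d.text.lower().split())))
    return scored[:k]

def is_faithful(answer: str, context_docs: list) -> bool:
    """Crude lexical faithfulness check (requirement 2): every word of the
    answer must appear somewhere in the retrieved context."""
    context = " ".join(d.text.lower() for d in context_docs)
    return all(w in context for w in answer.lower().split())
```

The timestamp filter mirrors the legal constraint that a 2020 clinical guideline cannot be used to judge the standard of care in a 2014 case; the faithfulness check mirrors the prohibition on private knowledge leaking into the response.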
Structured Definitions and Segmentations for Legal Reasoning in LLMs: A Study on Indian Legal Data
Khatri, Mann, Yusuf, Mirza, Shah, Rajiv Ratn, Kumaraguru, Ponnurangam
Large Language Models (LLMs), trained on extensive datasets from the web, exhibit remarkable general reasoning skills. Despite this, they often struggle in specialized areas like law, mainly because they lack domain-specific pretraining. The legal field presents unique challenges, as legal documents are generally long and intricate, making it hard for models to process the full text efficiently. Previous studies have examined in-context approaches to address the knowledge gap, boosting model performance in new domains without full domain alignment. In our paper, we analyze model behavior on legal tasks by conducting experiments in three areas: (i) reorganizing documents based on rhetorical roles to assess how structured information affects long context processing and model decisions, (ii) defining rhetorical roles to familiarize the model with legal terminology, and (iii) emulating the step-by-step reasoning of courts regarding rhetorical roles to enhance model reasoning. These experiments are conducted in a zero-shot setting across three Indian legal judgment prediction datasets. Our results reveal that organizing data or explaining key legal terms significantly boosts model performance, with a minimum increase of ~1.5% and a maximum improvement of 4.36% in F1 score compared to the baseline.
- North America > Canada > Alberta > Census Division No. 13 > Westlock County (0.24)
- North America > Canada > Alberta > Census Division No. 11 > Sturgeon County (0.24)
- Europe > United Kingdom (0.14)
- (8 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.71)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
- North America > United States > California (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- (4 more...)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > California (0.04)
- North America > Canada > Alberta > Census Division No. 13 > Westlock County (0.04)
- (4 more...)
- Research Report > Experimental Study (1.00)
- Research Report > Strength High (0.94)
- Government (1.00)
- Health & Medicine (0.93)
- Law (0.68)
- North America > Canada > Alberta > Census Division No. 13 > Westlock County (0.14)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- North America > United States > Pennsylvania (0.04)
- (4 more...)
- Health & Medicine > Therapeutic Area > Neurology (0.54)
- Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (0.41)
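The first intervention in the abstract above, reorganizing a judgment by rhetorical role, can be sketched as grouping labeled sentences into contiguous sections. The role labels and ordering below are hypothetical; actual rhetorical-role schemas for Indian judgment datasets differ.

```python
from collections import defaultdict

# Hypothetical role schema; real datasets use finer-grained labels.
ROLE_ORDER = ["FACTS", "ARGUMENTS", "REASONING", "DECISION"]

def reorganize(sentences: list) -> str:
    """Group (role, sentence) pairs under role headers in a fixed order, so the
    model sees structurally contiguous blocks instead of interleaved text."""
    by_role = defaultdict(list)
    for role, sent in sentences:
        by_role[role].append(sent)
    sections = []
    for role in ROLE_ORDER:
        if by_role[role]:
            sections.append(role + ":\n" + "\n".join(by_role[role]))
    return "\n\n".join(sections)
```

Structured input of this kind is one plausible mechanism behind the reported F1 gains: the model no longer has to infer which sentences serve which function while also processing a long context.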
Thinker: Training LLMs in Hierarchical Thinking for Deep Search via Multi-Turn Interaction
Xu, Jun, Du, Xinkai, Ao, Yu, Zhao, Peilong, Li, Yang, Zhong, Ling, Yuan, Lin, Bo, Zhongpu, Wang, Xiaorui, Sun, Mengshu, Gui, Zhengke, Zhang, Dalong, Wang, Zhaoyang, Wang, Qiwei, Hou, Yangyang, Yin, Zhiying, Wang, Haofen, Chen, Huajun, Liang, Lei, Zhou, Jun
Efficient retrieval of external knowledge bases and web pages is crucial for enhancing the reasoning abilities of LLMs. Previous works on training LLMs to leverage external retrievers for solving complex problems have predominantly employed end-to-end reinforcement learning. However, these approaches neglect supervision over the reasoning process, making it difficult to guarantee logical coherence and rigor. To address these limitations, we propose Thinker, a hierarchical thinking model for deep search through multi-turn interaction, making the reasoning process supervisable and verifiable. It decomposes complex problems into independently solvable sub-problems, each dually represented in both natural language and an equivalent logical function to support knowledge base and web searches. Concurrently, dependencies between sub-problems are passed as parameters via these logical functions, enhancing the logical coherence of the problem-solving process. To avoid unnecessary external searches, we perform knowledge boundary determination to check if a sub-problem is within the LLM's intrinsic knowledge, allowing it to answer directly. Experimental results indicate that with as few as several hundred training samples, the performance of Thinker is competitive with established baselines. Furthermore, when scaled to the full training set, Thinker significantly outperforms these methods across various datasets and model sizes. The source code is available at https://github.com/OpenSPG/KAG-Thinker.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Europe > Austria > Vienna (0.14)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- (15 more...)
- Research Report (0.81)
- Workflow (0.68)
- Leisure & Entertainment (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
- Media > Film (0.68)
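The decomposition scheme described above, sub-problems whose answers are passed as parameters into dependent logical functions, with a knowledge-boundary check deciding between direct answering and external search, can be sketched as a small dependency resolver. This is an assumed illustration, not the KAG-Thinker implementation; `solve`, the template format, and the sample data are invented.

```python
def solve(subproblems: dict, known: dict, search) -> dict:
    """Resolve sub-problems in dependency order.

    subproblems: name -> (list of dependency names, query template with {} slots)
    known:       intrinsic knowledge; a grounded query found here is answered
                 directly (knowledge-boundary determination)
    search:      fallback callable for queries outside the boundary
    """
    answers = {}

    def resolve(name):
        if name in answers:
            return answers[name]
        deps, template = subproblems[name]
        args = [resolve(d) for d in deps]   # dependency answers as parameters
        query = template.format(*args)      # grounded "logical function" call
        answers[name] = known.get(query) or search(query)
        return answers[name]

    for name in subproblems:
        resolve(name)
    return answers
```

Because every sub-problem is materialized as an explicit grounded query, each step of the chain can be checked independently, which is the sense in which the reasoning process becomes supervisable and verifiable.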
PRBench: Large-Scale Expert Rubrics for Evaluating High-Stakes Professional Reasoning
Akyürek, Afra Feyza, Gosai, Advait, Zhang, Chen Bo Calvin, Gupta, Vipul, Jeong, Jaehwan, Gunjal, Anisha, Rabbani, Tahseen, Mazzone, Maria, Randolph, David, Meymand, Mohammad Mahmoudi, Chattha, Gurshaan, Rodriguez, Paula, Mares, Diego, Singh, Pavit, Liu, Michael, Chawla, Subodh, Cline, Pete, Ogaz, Lucy, Hernandez, Ernesto, Wang, Zihao, Bhatter, Pavi, Ayestaran, Marcos, Liu, Bing, He, Yunzhong
Frontier model progress is often measured by academic benchmarks, which offer a limited view of performance in real-world professional contexts. Existing evaluations often fail to assess open-ended, economically consequential tasks in high-stakes domains like Legal and Finance, where practical returns are paramount. To address this, we introduce Professional Reasoning Bench (PRBench), a realistic, open-ended, and difficult benchmark of real-world problems in Finance and Law. We open-source its 1,100 expert-authored tasks and 19,356 expert-curated criteria, making it, to our knowledge, the largest public rubric-based benchmark for the legal and finance domains. We recruited 182 qualified professionals, each holding a JD, a CFA, or 6+ years of experience, who contributed tasks inspired by their actual workflows. This process yields significant diversity, with tasks spanning 114 countries and 47 US jurisdictions. Our expert-curated rubrics are validated through a rigorous quality pipeline, including independent expert validation. Subsequent evaluation of 20 leading models reveals substantial room for improvement, with top scores of only 0.39 (Finance) and 0.37 (Legal) on our Hard subsets. We further catalog the economic impacts associated with the prompts and analyze performance using human-annotated rubric categories. Our analysis shows that models with similar overall scores can diverge significantly on specific capabilities. Common failure modes include inaccurate judgments, a lack of process transparency, and incomplete reasoning, highlighting critical gaps in their reliability for professional adoption.
- North America > United States > California (0.04)
- North America > Canada > Alberta > Census Division No. 13 > Westlock County (0.04)
- North America > Canada > Alberta > Census Division No. 11 > Sturgeon County (0.04)
- (2 more...)
- Banking & Finance (1.00)
- Health & Medicine > Government Relations & Public Policy (0.67)
- Law > Litigation (0.46)
- Government > Regional Government > North America Government > United States Government (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
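Rubric-based scoring of the kind PRBench reports (e.g., 0.39 on the Finance Hard subset) can be sketched as a weighted fraction of satisfied criteria. The `Criterion` structure, weights, and example criteria below are hypothetical; the benchmark's actual grading protocol may weight and aggregate differently.

```python
from dataclasses import dataclass

@dataclass
class Criterion:
    description: str
    weight: float
    met: bool  # judged by an expert or a grader model; fixed here for illustration

def rubric_score(criteria: list) -> float:
    """Weighted fraction of rubric criteria satisfied, in [0, 1]."""
    total = sum(c.weight for c in criteria)
    if total == 0:
        return 0.0
    return sum(c.weight for c in criteria if c.met) / total
```

A score of 0.39 under such a scheme means that, weight for weight, models satisfy well under half of what a qualified professional would check for.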